ABSTRACT. The use of analytic methods for extracting learning strategies from trace data has
attracted considerable attention in the literature. However, there is a paucity of research
examining the association between learning strategies extracted from trace data and responses
to well-established self-report instruments and performance scores. This paper focuses on the
link between the learning strategies identified in trace data and student-reported approaches
to learning. It reports the findings of a study conducted in an undergraduate engineering
course (N=144) that followed a flipped classroom design. The study found that learning
strategies extracted from trace data can be interpreted in terms of deep and surface approaches
to learning. Significant links with self-report measures, with small effect sizes, were detected
for both the overall deep approach to learning scale and the deep strategy scale. However, no
significant links were observed for the surface approach to learning or surface strategy scales,
nor for the motivation scales of approaches to learning. Significant effects on academic
performance were also found, consistent with the literature based on self-report instruments:
students who followed a deep approach to learning had significantly higher performance.

1 INTRODUCTION

The field of learning analytics evolved from the increased opportunities to collect and make use of data 
about learning and learning contexts (known as trace or log data) (Gasevic, Dawson, & Siemens, 2015). 
Although the field is driven by two underlying principles, to understand and to optimize learning and
the environments in which learning occurs, very little research to date has directly addressed
them (Siemens & Gasevic, 2012). In the early days of learning analytics, much attention was dedicated to
the prediction of learning success. This was primarily motivated by easy access to data that could be
used for predictive modelling and by interest in both optimizing institutional processes and increasing
educational and monetary benefits for learners and educational providers (Colvin et al., 2015). Recent
research in learning analytics, however, recognizes the significance of building upon educational theory in
order to enable the use of advanced machine learning methods to model behavioural, cognitive, and
social processes associated with learning (Dawson, Drachsler, Rose, Gasevic, & Lynch, 2016).

1.1 Learning Analytics and Learning Theory 

Several authors have recently argued that in order to advance research and practice in learning analytics 
there is a critical need to connect and deepen such analytics with learning theory (Gasevic, Dawson, 
Rogers, & Gasevic, 2016; Lodge & Lewis, 2012; Rogers, Gasevic, & Dawson, 2016; Wise, 2014; Wise & 
Shaffer, 2015). For example, Gasevic, Dawson, Rogers, and Gasevic (2016) suggest "a theoretically driven 
approach [that] leads to an ontologically deep engagement with intentions and causes, and the 
validation of models of learning, learning contexts, and learner behavior" (p. 70). Furthermore, Gasevic 
et al. empirically show that instructional conditions need to be accounted for when examining the
association between digital trace data and learning outcomes in order to derive actionable insights into
student learning progress. The importance of theory has also been explored in other studies, such as the
use of theory-informed mechanisms to develop learning analytics that support teacher regulation of
collaborative groups (van Leeuwen, 2015), and the examination of effective study practices such
as the spacing effect (Miyamoto et al., 2015) and revisiting previously studied resources (Svihla, Wester, &
Linn, 2015).

The use of existing theory offers many benefits related to opportunities to improve study designs, 
inform selection of relevant variables and hypotheses formulation, enhance interpretation of the study 
findings, facilitate comparisons of the results with respect to already published findings, and enable 
replication of previous studies (Gasevic et al., 2015; Wise & Shaffer, 2015). A common recommendation 
is that studies involving the use of digital traces and learning analytics methods should start from an 
existing theory to inform their research questions and operationalize the measurements, and thus 
establish the use of trace data as valid proxies of constructs under study. This approach is already 
gaining much traction in the field of learning analytics and can be used as an effective way to study 
different complex concepts such as motivation (Zhou & Winne, 2012) and study strategy (Lust, Elen, & 
Clarebout, 2013b). 

1.2 Self-Reported Measures and Learning Analytics 

Although recent literature demonstrates some promising results stemming from the connection of 
learning theory with learning analytics, some tensions need to be further investigated. The conventional 
research in the learning sciences makes extensive use of self-report instruments. According to Azevedo 
(2015), self-reports, in addition to classroom discourse, are the only proven approach that can be used 
for the measurement of cognitive, metacognitive, affective, and motivational constructs of student 
engagement. This provides the rationale for making use of existing self-report instruments to interpret
and triangulate findings obtained through the use of trace data (Beheshitha, Hatala, Gasevic, &
Joksimovic, 2016; Lust et al., 2013b).

Associations between trace and self-reported data on the same construct are not consistently observed. 
For example, Winne and Jamieson-Noel (2002) showed that learners are inaccurate in calibrating their 
self-reported and actual measures of the use of specific study tactics. Their study demonstrated that 
learners have a tendency to overestimate the use of specific study tactics. According to Zhou and Winne 
(2012) this inaccuracy in self-reports is likely due to poor learner reflection. As the authors stated, 
"...accounts may be based, in part, on biased information arising from incomplete and reconstructed 
memories plus subjective and implicit theories of the mental processes involved" (p. 414). Moreover, 
the Zhou and Winne (2012) study showed that trace data-based measures of student achievement goal 
orientation had much stronger associations with learning outcomes than self-reported ones. The 
authors interpret this finding as the difference between perceived intention and actual behaviour. The 
self-reported data measured student intentions while trace data measured realized intentions and 
allowed for collection of finer grain data points that were more proximal to the actual learning 
experiences. Thus, trace data had lower bias than that arising "from incomplete and reconstructed 
memories" (Zhou & Winne, 2012, p. 414). 

Combined use of trace data and self-reported measures is a new avenue of research recently reported in 
the literature. Pardo, Ellis, and Calvo (2015) explored how the conclusions drawn from quantitative
data based on digital traces and from self-reported qualitative data can be related. They concluded that
the combined approach may lead to changes in learning designs not previously considered when only 
using one of the two data sources. In another study, Pardo, Han, and Ellis (2016) explored statistical 
models that combine self-reported measures of self-regulation, and digital traces extracted from the 
logs recorded by an online platform. Both studies point to the need to expand conventional analysis 
techniques to combine self-reported data sources with those derived from trace data recorded by online 
learning platforms. 

1.3 Learning Analytics and Learning Strategy 

The study reported in this paper looks at student learning strategies, opportunities for their 
measurement with trace data, associations with existing self-reported instruments of relevance, and 
effects of study strategies on learning outcomes. According to Weinstein, Husman, and Dierking (2000, 
p. 227) a learning strategy includes "any thoughts, behaviors, beliefs or emotions that facilitate the 
acquisition, understanding or later transfer of new knowledge and skills." Making effective choices and 
adaptation of learning strategies in response to the emerging needs from the learning environment are 
critical features of effective self-regulated learning. Such features are especially important in 
technology-enhanced environments where a high degree of self-regulated learning is necessary for 
learning success. However, existing research indicates that learners 1) tend to use ineffective learning
strategies (Winne & Jamieson-Noel, 2003), and 2) do not make effective use of available resources to
optimize their learning, even in environments that build on effective learning designs (Ellis, Marcus,
& Taylor, 2005; Lust, Elen, & Clarebout, 2013a).

A common approach to identifying learning strategies in learning analytics uses unsupervised methods 
for the analysis of trace data that capture activities of learners of relevance for learning designs in 
different contexts. Generally, studies have identified three to six learning strategies evolving from 
student use of online resources (Del Valle & Duffy, 2009; Kovanovic, Gasevic, Joksimovic, Hatala, & 
Adesope, 2015; Lust et al., 2013a; Wise, Speer, Marbouti, & Hsiao, 2013). For example, Lust, 
Vandewaetere, Ceulemans, Elen, and Clarebout (2011) reported three strategies found to be used by 
undergraduate educational sciences students in a blended course. These strategies included 1) no-users, 
who had very limited use of the online resources and did not use any of the provided face-to-face tools, 
2) intensive users who regularly made use of the tools provided in the course design, and 3) incoherent 
users who only used online tools and did not engage with any of the face-to-face tools provided in the 
course design. Moreover, several studies also report significant associations between learning 
strategies, derived from trace data, and learning outcomes. For example, Lust et al. (2013b) reported 
that the adopted learning strategy had a significant moderate effect on student academic performance 
in an undergraduate educational sciences blended learning course. Kovanovic et al. (2015) showed that 
learning strategy had a significant and large effect on the quality of knowledge construction evolving 
from online discussions in a fully online software engineering master's course. 

Learning strategies reported in these studies are typically interpreted with respect to established 
theories such as approaches to learning (Trigwell & Prosser, 1991), goal orientations (Elliot & McGregor, 
2001), and self-efficacy (Zimmerman, 2000). However, the majority of studies collected only the trace 
data related to the constructs of these theories. In contrast to this trend, Lust et al. (2013b) collected 
both trace data and self-reports about achievement goal orientations and self-efficacy. Self-reported 
data were then used to identify associations with strategies identified from trace data, and thus offer 
interpretations of the identified strategies. 

1.4 Research Aim 

The study reported in this paper examines the association between student approaches to learning 
(Biggs, 1987) and study strategies extracted from digital trace data about learner interactions with 
online learning resources. Approaches to learning are well-studied in the educational literature and offer 
a wealth of insights that can inform educational practice and research. Approaches to learning are 
referred to as either deep or surface. Deep learning reflects an ideal of modern education and is
indicative of conceptual change. In contrast, surface learning is typically associated with rote learning
and memorization. Several studies indicate that students with high tendency towards deep approaches 
to learning have significantly higher academic performance than students with a high inclination 
towards surface approaches (Bliuc, Ellis, Goodyear, & Piggott, 2010; Ellis, Goodyear, Calvo, & Prosser, 
2008). Trigwell, Prosser, and Waterhouse (1999) also identified an association between instructor 
conceptions of teaching and student approaches to learning. That is, students in classes taught by
instructors whose conception of teaching was conceptual change had a higher tendency towards a deep
approach to learning. Conversely, students with a high tendency towards a surface approach to learning
were more frequently observed in classes taught by instructors whose conception of teaching was
knowledge transmission.

The literature that conceptualizes approaches to learning connects the roles of motivation and strategy 
to promote deep learning. This is best reflected in the well-known self-report instrument used for the 
measurement of approaches to learning, which has four main subscales: deep motive (DM), deep 
strategy (DS), surface motive (SM), and surface strategy (SS), whereby DM and DS measure deep 
approaches to learning, while SM and SS measure surface approaches to learning (Biggs, Kember, & 
Leung, 2001). This conceptualization, composed of motivation and strategy components, makes 
approaches to learning suitable for the study of the association between self-reported approaches and 
the strategies identified from trace data. 

Specifically, this study looks at the following research questions: 

RQ1. Can we identify groups of learners based on learning strategies extracted from trace data? If so, 
can the identified groups be attributed to student approaches to learning? In other words, can the 
groups mined from student learning actions be explained by student approaches to learning? 

RQ2. Are there significant differences between the identified student groups with respect to self- 
reported measures of approaches to learning? 

RQ3. Are there significant differences between students with deep and surface approaches to learning 
extracted from trace data with respect to academic achievement? 

2 METHODS 

2.1 Study Context 

The context of the study was a first-year engineering course in computer systems at an Australian 
research-intensive higher education institution. The course lasted 13 weeks and enrolled 290 students 
(81.5% male, 18.5% female). The flipped learning (FL) strategy of the course consisted of two key 
elements (Pardo & Mirriahi, in press): 1) a set of preparatory learning activities to be completed prior to 
the face-to-face session with the instructor (i.e., the lecture); and 2) a redesigned lecture framed as an 
active learning session requiring student preparation and participation in collaborative problem solving 
tasks. 

The study focused on the lecture preparation activities. These activities were considered essential for 
enabling students to participate effectively in the face-to-face sessions and therefore were crucial for 
the overall success of the FL design. Specifically, the preparation activities included 1) short videos that
introduced and explained relevant course concepts; 2) multiple-choice questions (MCQs) that followed
each video and covered the concepts discussed in the video, offered as formative assessment
promoting simple factual recall; 3) reading materials with embedded MCQs, which were
conceptualized in the same way and served the same formative role as the MCQs accompanying the course
videos; and 4) problem (exercise) sequences that served as summative assessment. While working on these
activities, students had access to an analytics dashboard offering them real-time feedback on their
engagement level and activity scores (Khan & Pardo, 2016). The dashboard was updated every 15
minutes, and the magnitudes were reset each week.

2.2 Data Sources and Variables 

The study incorporated three data sources. The first was the Study Process Questionnaire (SPQ) aimed 
at assessing student approaches to learning in a given learning context (Biggs et al., 2001). Since it was 
administered at the beginning of the course, it provided insight into the extent to which student learning 
approaches differed in the given teaching context. The questionnaire contained 20 questions with 
answers based on a seven-point Likert scale (from strongly disagree to strongly agree). The questions 
were organized into four groups measuring the following four constructs: deep motive (DM), deep 
strategy (DS), surface motive (SM), and surface strategy (SS). To compute values of the variables 
corresponding to these constructs, we averaged answers to the questions related to each construct. In 
addition, as suggested by Biggs et al. (2001), the Deep Approach (DA) variable was computed by 
averaging the values of DM and DS variables, whereas the Surface Approach (SA) variable was calculated 
as the average of the SM and SS variables. The SPQ-based variables were essential for addressing our
research questions. However, a proportion of the enrolled students (N_no-survey = 146) did not complete
the SPQ. As such, the analyses were based only on the data of the students who did
complete the questionnaire (N_survey = 144).
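
The scoring described above can be illustrated with a minimal Python sketch that computes the six
SPQ variables for one student. The item-to-subscale mapping is an assumption taken from the
commonly cited R-SPQ-2F scoring key and is not stated in this paper; the function and variable
names are hypothetical.

```python
from statistics import mean

# ASSUMPTION: 1-based item numbers per subscale, following the commonly
# cited R-SPQ-2F scoring key. The paper does not list this mapping.
SUBSCALE_ITEMS = {
    "DM": [1, 5, 9, 13, 17],   # deep motive
    "DS": [2, 6, 10, 14, 18],  # deep strategy
    "SM": [3, 7, 11, 15, 19],  # surface motive
    "SS": [4, 8, 12, 16, 20],  # surface strategy
}


def spq_scores(answers):
    """Return DM, DS, SM, SS, DA, and SA for one student's 20 Likert answers."""
    if len(answers) != 20:
        raise ValueError("expected answers to all 20 SPQ items")
    # Each subscale score is the average of its items.
    scores = {
        name: mean(answers[item - 1] for item in items)
        for name, items in SUBSCALE_ITEMS.items()
    }
    # The approach scores average the motive and strategy subscales.
    scores["DA"] = mean([scores["DM"], scores["DS"]])
    scores["SA"] = mean([scores["SM"], scores["SS"]])
    return scores
```

A student who answered 7 to every DM item, 5 to every DS item, 3 to every SM item, and 1 to every
SS item would receive DA = 6 and SA = 2 under this scheme.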

The second data source included trace data related to the students' preparatory learning activities 
during the active period of the 2014 delivery of the course (weeks 2-13). These data were collected 
from the Learning Management System (LMS) used in the course. Learning sessions were extracted from 
the trace data as logs of continuous sequences of events where any two consecutive events were within 
30 minutes of one another (Khan & Pardo, 2016). This resulted in 6,196 learning sessions for the 144 
students (who filled in SPQ) and the 12 active weeks of the course. These learning sessions were 
encoded as sequences of learning actions, based on the sequence representation format of the
TraMineR R package (Gabadinho, Ritschard, Mueller, & Studer, 2011), which was used for the exploration
and subsequent clustering of the learning sequences. Examples of actions that form learning sequences
included formative assessment done correctly, formative assessment done incorrectly, asking to see the 
solution for a formative assessment item, watching a course video, accessing a page with the course 
reading content, and the like. 
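
The session extraction rule can be sketched as follows. This is a minimal illustration rather than
the procedure actually used in the study: the event format (timestamp, action label) and the action
names are hypothetical, and treating a gap of exactly 30 minutes as part of the same session is an
assumption.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)


def split_into_sessions(events, gap=SESSION_GAP):
    """Group one student's (timestamp, action) events into learning sessions.

    A new session starts whenever the time since the previous event exceeds
    `gap`; each session is returned as a list of action labels.
    """
    sessions = []
    previous_time = None
    for timestamp, action in sorted(events, key=lambda event: event[0]):
        if previous_time is None or timestamp - previous_time > gap:
            sessions.append([])  # gap too long: open a new session
        sessions[-1].append(action)
        previous_time = timestamp
    return sessions
```

Applied per student to the full event log, a rule of this kind yields the sequences of learning
actions that are subsequently compared and clustered.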

The LMS also served as the data source for student assessment results (scores on the midterm and final 
exams). The midterm and final exam scores are numerical variables with values in the range [0-20] and 
[0-40] respectively.

2.3 Data Analysis

Clustering was used for grouping similar learning sequences (N=6,196) to detect patterns in student 
learning behaviour (i.e., adopted learning strategies), and subsequently for grouping students (N=144) 
based on the identified sequence patterns (i.e., learning strategies). In both cases, we used 
agglomerative hierarchical clustering, based on Ward's algorithm. This clustering technique was 
suggested as particularly suitable for detecting student groups in online learning contexts (Kovanovic et 
al., 2015). 
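
For readers unfamiliar with the technique, the sketch below shows one way agglomerative clustering
with Ward's criterion can be run on a precomputed dissimilarity matrix, using the Lance-Williams
update. It is illustrative only: the study relied on existing R tooling, the function name is
hypothetical, and working on squared dissimilarities is an assumption about which Ward variant
applies.

```python
def ward_clusters(dist, n_clusters):
    """Agglomerative clustering with Ward's criterion.

    `dist` is a symmetric square matrix (list of lists) of dissimilarities.
    Returns a list of clusters, each a sorted list of item indices.
    """
    n = len(dist)
    # Squared dissimilarities between active clusters, keyed by (low_id, high_id).
    d2 = {(i, j): dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)}
    members = {i: [i] for i in range(n)}
    next_id = n
    while len(members) > n_clusters:
        i, j = min(d2, key=d2.get)  # closest pair under Ward's criterion
        dij = d2[(i, j)]
        ni, nj = len(members[i]), len(members[j])
        updated = {}
        for k in members:
            if k in (i, j):
                continue
            nk = len(members[k])
            dki = d2[(min(k, i), max(k, i))]
            dkj = d2[(min(k, j), max(k, j))]
            # Lance-Williams update for Ward's method.
            updated[k] = ((ni + nk) * dki + (nj + nk) * dkj - nk * dij) / (ni + nj + nk)
        # Drop every entry that involves the two merged clusters.
        d2 = {pair: v for pair, v in d2.items() if i not in pair and j not in pair}
        members[next_id] = members.pop(i) + members.pop(j)
        for k, value in updated.items():
            d2[(k, next_id)] = value  # next_id is larger than any existing id
        next_id += 1
    return [sorted(group) for group in members.values()]
```

The same routine can serve both stages described above: once on pairwise sequence dissimilarities,
and once on distances between per-student feature vectors.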

Learning sequences were clustered based on their similarity, computed using the optimal matching
method. A variant of Levenshtein's (1966) edit distance, this method computes the
distance between any two learning sequences as the minimal cost, in terms of insertions, deletions,
and/or substitutions of learning actions, required for transforming one sequence into the other
(Gabadinho et al., 2011).
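
The optimal matching distance can be illustrated with a short dynamic programming sketch. The cost
values (1 for an insertion or deletion, 2 for a substitution) are assumptions chosen for
illustration, since the paper does not report the costs used, and the action labels in the usage
note are hypothetical.

```python
def optimal_matching_distance(seq_a, seq_b, indel_cost=1.0, substitution_cost=2.0):
    """Minimal total cost of edits that transform seq_a into seq_b."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    # cost[i][j] = cheapest way to turn the first i actions of seq_a
    # into the first j actions of seq_b.
    cost = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * indel_cost  # delete everything
    for j in range(1, cols):
        cost[0][j] = j * indel_cost  # insert everything
    for i in range(1, rows):
        for j in range(1, cols):
            same = seq_a[i - 1] == seq_b[j - 1]
            cost[i][j] = min(
                cost[i - 1][j] + indel_cost,      # deletion
                cost[i][j - 1] + indel_cost,      # insertion
                cost[i - 1][j - 1] + (0.0 if same else substitution_cost),
            )
    return cost[-1][-1]
```

For example, with these costs the sequences `["video", "mcq_ok", "mcq_ok"]` and
`["video", "mcq_wrong", "mcq_ok", "dashboard"]` are at distance 3.0.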

Clustering of students was based on the output of the sequence clustering. In particular, the features used
for student clustering included 1) four variables, seq.clust_i, i = 1, ..., 4, where seq.clust_i is the number of
learning sequences in sequence cluster i for a particular student, and 2) the seq.total feature, representing
the total number of learning sequences per student.
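
A minimal sketch of how such a feature vector could be assembled is given below. The input format,
a list of (student id, sequence cluster) pairs with one entry per learning session, and the
function name are hypothetical.

```python
from collections import Counter


def student_features(session_clusters, n_clusters=4):
    """Build [seq.clust_1, ..., seq.clust_n, seq.total] for every student.

    `session_clusters` holds one (student_id, sequence_cluster) pair per
    learning session, with sequence clusters numbered from 1.
    """
    counts = {}
    for student_id, cluster in session_clusters:
        counts.setdefault(student_id, Counter())[cluster] += 1
    features = {}
    for student_id, counter in counts.items():
        per_cluster = [counter.get(c, 0) for c in range(1, n_clusters + 1)]
        features[student_id] = per_cluster + [sum(per_cluster)]
    return features
```

Each student is thus described by five numbers, which are the inputs to the student-level
clustering.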

Following Biggs and colleagues' (2001) conceptualization of learning approaches, the identified
student clusters were categorized into two groups reflective of deep and surface approaches to learning.
To compare these two groups with respect to the SPQ variables (DM, DS, SM, SS, DA, and SA), the
Mann-Whitney U test was used, as the variables did not meet the homogeneity of variances assumption
required for parametric tests. The same test was used for the comparison of the two groups with
respect to the midterm and final exam scores (these variables were not normally distributed). Cohen's d
was used for assessing the effect size. The significance level was set at alpha=0.05.
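
To make the test concrete, the sketch below computes the Mann-Whitney U statistic together with a Z
score and two-sided p value from the normal approximation, with a correction for ties. It is an
illustration under stated assumptions: the paper does not say which software or which variant of
the approximation was used, and no continuity correction is applied here.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_z(x, y):
    """Return (U for sample x, Z, two-sided p) via the normal approximation."""
    combined = sorted((value, group) for group, sample in enumerate((x, y)) for value in sample)
    n = len(combined)
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # tied values share the average rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(rank for rank, (_, group) in zip(ranks, combined) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    # Variance of U with the tie correction (undefined if all values are equal).
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean_u) / sqrt(var_u)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, z, p
```

A positive Z indicates that values in the first sample tend to rank higher than those in the
second, which matches the direction in which the group comparisons are reported.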

3 RESULTS 

3.1 RQ1: Student Groups with Shared Patterns in Learning Behaviour 

The cluster analyses of the extracted learning sequences (N=6,196) led to the following four-cluster
solution:

1. Focus on formative assessment. Sequences following this pattern (N=792; 12.78% of the total 
number of extracted sequences) are characterized by the dominance of activities related to 
formative assessment, and almost complete absence of summative assessment. Interaction with the 
course reading materials is slightly present, and tends to be more prominent at the beginning of the 
learning sessions. Metacognitive evaluation activities (i.e., access to the dashboard) tend to occur
towards the end of these learning sessions.

2. Summative assessment through trial and error. This pattern is the most prominent one (N=2,488;
40.15% of all the extracted sequences) with sequences largely dominated by summative assessment
activities that more frequently result in incorrect than in correct solutions.

3. Studying reading materials. Sequences sharing this pattern (N=1,891; 30.52%) mainly consist of
interactions with the class reading materials and a tiny fraction of formative assessment activities. 
These sequences tend to be shorter, and end with watching the course videos. 

4. Video watching coupled with (mostly formative) assessment. Sequences in this group (N=1,025;
16.54%) are characterized by the large presence of video watching activities. A considerable number
of formative assessment activities are gradually, towards the end of the sessions, substituted by 
summative assessment. Another specificity of this pattern is the presence of metacognitive activities 
at the beginning of the sessions. 

Clustering of students based on the identified clusters of learning sequences led to a solution with
four student clusters as the best one. Table 1 describes the obtained student clusters by providing basic
descriptive statistics (median, 25th, and 75th percentiles) for the five variables used for clustering
(number of student learning sequences in each of the four sequence clusters, and the total number of
student sequences). The table also gives descriptive statistics for the group (i.e., cluster) scores on the
midterm exam and the final exam.


Table 1: Summary Statistics for the Four Student Clusters: Median (25th, 75th Percentiles)

                                  Student clusters
Variable                          Cluster 1       Cluster 2         Cluster 3         Cluster 4
                                  (N=17; 11.80%)  (N=38; 26.39%)    (N=48; 33.33%)    (N=41; 28.47%)
Number of seq. in seq. cluster 1  16 (12, 21)     5.5 (3.25, 8)     4 (3, 6)          1 (0, 2)
Number of seq. in seq. cluster 2  21 (19, 23)     19 (17, 22)       17 (14, 20)       14 (11, 17)
Number of seq. in seq. cluster 3  32 (25, 37)     18 (14, 20)       10 (8, 12)        5 (3, 6)
Number of seq. in seq. cluster 4  10 (9, 16)      11 (9, 14)        5 (3, 7.25)       2 (1, 4)
Total number of sequences         76 (74, 87)     54.5 (49.25, 59)  36 (33, 41.25)    23 (19, 26)
Midterm exam score                16 (13, 17)     16 (13.25, 17)    14 (11, 16)       11 (10, 15)
Final exam score                  26 (19, 32)     27.5 (15.25, 31)  17 (12.75, 23.5)  15 (11, 21)

According to Biggs et al. (2001), students can be differentiated based on their approaches to learning: 
the deep approach is characterized by critical evaluation and syntheses of information, and driven by 
intrinsic motivation, whereas the surface approach is dominated by shallow cognitive strategies and is 
associated with extrinsic motivation. Students from the first two clusters presented in Table 1 can be
characterized as having a deep approach to learning, since they were actively engaged with the course
(especially students from cluster 1) and practiced a variety of learning strategies, apparently trying to
adapt to the course requirements. The fact that these students had high exam performance indicates
that they tended to be successful in adapting and regulating their learning. Moreover, they were more
engaged in strategies proven to be effective in promoting learning than in those recognized as
passive. For example, engagement in the formative assessment opportunities can be interpreted as
the use of self-testing, a study tactic shown to be one of the effective desirable difficulties in
learning (Bjork & Bjork, 2011). Reading and video watching, on the other hand, are typically reported in
the literature as ineffective study strategies.

Students from clusters 3 and 4 can be categorized as following a surface approach in the context of the 
examined course. In the case of cluster 4, the surface approach to learning is evident in the students' 
low engagement levels. Students from cluster 3 can be characterized as selective, performance- 
oriented, and aimed at achieving high scores through minimal engagement (evident in their primary 
focus on summative assessment — strategy 2). While in some cases this performance-oriented approach 
might lead to good exam performance, it was not the case in this study. This finding suggests that the 
ability of these students to regulate their learning was less than optimal. This finding is further 
supported by the fact that when not engaged in summative assessment, students from cluster 3 
preferred the reading strategy (strategy 3) over the two more effective learning strategies presented 
through formative assessment (strategies 1 and 4). Finally, students from clusters 3 and 4 had
fewer learning sequences than students from clusters 1 and 2; this
suggests a lower level of motivation.

3.2 RQ2: Comparison of Observed and Perceived Approaches to Learning 

To examine the level of correspondence between student approaches to learning identified through the 
analysis of their learning sequences and their learning approaches estimated through SPQ, we first 
grouped the students from clusters 1 and 2 into the deep approach group, and clusters 3 and 4 into the 
surface approach group. Next, we compared the two groups based on the six variables derived from the
student answers to the SPQ (Table 2). Mann-Whitney U tests showed statistically
significant differences between the two groups for the Deep Strategy (DS) and Deep Approach (DA)
variables. In particular, students from the deep approach group had significantly higher scores on the DS
scale than students from the surface approach group: Z=2.7206, p=0.006, d=0.2267. Likewise, the deep
approach group had significantly higher scores on the DA scale than the surface approach group:
Z=2.2106, p=0.027, d=0.1842.

3.3 RQ3: Academic Achievement of the Observed Deep and Surface Approach Groups

To examine the observed deep and surface approach groups from the perspective of their academic 
achievement, we compared the scores of the two groups on the midterm exam and the final exam 
(Table 2, the last two rows). Mann-Whitney U tests confirmed that, compared to the surface approach
group (student clusters 3 and 4, Table 1), the deep approach group (student clusters 1 and 2) had
significantly higher midterm exam scores (Z=4.3133, p<0.0001, d=0.3594) and final exam scores
(Z=4.5136, p<0.0001, d=0.3761).
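
As a quick arithmetic check on the reported statistics, the short sketch below divides each
reported Z value by the square root of the sample size (N=144). The four results agree, to four
decimal places, with the effect sizes reported in Sections 3.2 and 3.3. Z/sqrt(N) is the formula
commonly given for the r effect size of rank-based tests; whether the authors computed their
values this way is an assumption, since the paper refers to the metric only as Cohen's d.

```python
from math import sqrt

N_STUDENTS = 144  # students who completed the SPQ

# (comparison, reported Z, reported effect size), copied from the text.
REPORTED = [
    ("Deep Strategy (DS)", 2.7206, 0.2267),
    ("Deep Approach (DA)", 2.2106, 0.1842),
    ("Midterm exam score", 4.3133, 0.3594),
    ("Final exam score", 4.5136, 0.3761),
]


def z_over_root_n(z, n=N_STUDENTS):
    """Effect size estimate obtained by dividing Z by the square root of N."""
    return z / sqrt(n)


for label, z, reported in REPORTED:
    print(f"{label}: Z/sqrt(N) = {z_over_root_n(z):.4f}, reported = {reported:.4f}")
```

Under either label, the DS and DA effects are small by conventional benchmarks, while the exam
score effects are somewhat larger.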


Table 2: Summary statistics for the 6 SPQ-based variables and exam scores
(median, 25th, and 75th percentiles)

Variables               Deep Approach group   Surface Approach group
Deep Strategy (DS)      5.0 (4.0, 5.4)        4.4 (4.0, 4.8)
Surface Strategy (SS)   3.6 (2.8, 4.4)        4.0 (3.2, 4.6)
Deep Motive (DM)        4.6 (3.8, 5.3)        4.4 (3.8, 5.0)
Surface Motive (SM)     3.0 (2.2, 3.8)        3.0 (2.6, 3.8)
Deep Approach (DA)      4.7 (4.0, 5.25)       4.4 (3.9, 4.8)
Surface Approach (SA)   3.4 (2.55, 4.05)      3.5 (2.9, 4.2)
Midterm exam score      16 (13, 17)           13 (11, 16)
Final exam score        27 (17, 31.5)         16 (12, 21)

4 DISCUSSION 

In addressing the first research question (RQ1), the study found four clusters of students with respect to 
their learning strategy as extracted from the trace data. Two of those clusters (1 and 2) corresponded to 
a deep approach to learning, while the remaining two (3 and 4) corresponded to a surface approach to 
learning. The clusters that corresponded to a deep approach showed a higher overall amount of activity 
compared to the clusters interpreted as having a surface approach to learning. The students in deep 
approach clusters also exhibited a good balance between the use of different strategies, effectively 
combining strategies proven to promote learning (i.e., formative assessment as a manifestation of the 
self-testing desirable difficulty) with those that are less potent (i.e., reading and video watching), as well 
as those that are more performance-oriented (i.e., strategy focused on summative assessment through 
trial and error). According to Entwistle (2009), a deep approach to learning typically involves a combined 
use of both deep and surface strategies to learning. Likewise, the literature on achievement goal 
orientation indicates that some elements of performance goal orientation are necessary for learners to 
better regulate their learning in order to meet the external standards set by the course design (Elliot & 
McGregor, 2001). Alternatively, the clusters of students characterized as those who followed a surface 
approach to learning predominantly followed a performance oriented strategy (i.e., summative 
assessment through trial and error) and demonstrated a lower overall amount of activity (i.e., likely 
lower motivation) than their peers engaged in a deep approach. 

The extraction of trace data to establish deep and surface approaches to learning complements the self- 
report instrument designed by Biggs and colleagues (2001) to measure a student's "approach to 
learning." Research question two (RQ2) was designed to further probe this assumption and test whether 
indeed there were significant differences between the groups extracted from trace data with respect to 
their responses to Biggs and colleagues' self-report instrument. The results showed a significant
difference between the groups for the overall self-reported deep approach (DA) scores and the
self-reported deep strategy (DS) scores. However, no significant differences were detected with respect to
either of the two motivation scales (DM and SM) or with regard to surface approach (SA) and surface
strategy (SS). The lack of difference on the surface strategy scale may be attributed to the already
mentioned position of Entwistle (2009) on the combination of deep and surface strategies in the deep
approach to learning, and to Elliot and McGregor's (2001) interpretation of performance orientation. This
interpretation needs to be tested in future studies, especially as the effect sizes were small (Cohen's d
was just around 0.2).

The lack of differences between the clusters extracted from the trace data on both motivational scales 
of the self-report instrument is harder to explain. Given that the instrument was administered at the 
beginning of the course, it represented students' motivational intentions at only a single point in time. 
However, literature on student motivation indicates that the largest proportion of variability in 
motivation and engagement is explained by within-day changes (23%) and differences between students (67%) 
(Martin et al., 2015). Similarly, Zhou and Winne (2012) demonstrated that real-time measurement of 
achievement goal orientation that was temporally proximal to the completion of actual learning 
activities had a much stronger association with learning outcomes than self-reported measures of 
achievement goal orientation administered at the start of the learning session. Zhou and Winne (2012) 
attributed this to the fact that self-reported measures represented only student intention, while 
real-time measures of goal orientation represented realized motivation intentions. Further research is 
therefore required to understand how real-time measurement of learning motivation in general, and of 
motivation in connection to approaches to learning in particular, can be achieved, so that advanced 
insights into approaches to learning based on trace data can be obtained. 

The comparison of the deep and surface approach groups extracted from trace data revealed significant 
differences in the performance scores on both mid-term and final exams of the course examined in the 
study. This finding might have been affected by student prior knowledge of the course topics; however, 
as the data related to this potentially confounding variable were not available, we were not able to 
control for it. Still, this finding is consistent with the previous literature based on self-reports and shows 
that students who follow a deep approach to learning have higher academic performance (Bliuc et al., 
2010; Ellis et al., 2008). This can inform teaching practice and be used as a foundation for a learning 
analytics tool for teachers to help them gain deeper insights into a student's approach to learning as 
revealed by the trace data. In essence, instructors could derive specific recommendations for their 
students with respect to the strategies they need to follow and corrective measures they can take to 
optimize their students' approaches to learning. This implication for teaching practice is contingent on 
the conceptions teachers may have and assumes that their conceptions of teaching are in alignment 
with a deep approach to learning (Trigwell et al., 1999). A direct impact of teacher conceptualizations 
and student trace data is the embedding of more elaborate learning designs resulting in the effective 
use of technology that promotes conceptual change. However, achievement of this impact also implies 
that instructors are cognizant of how their teaching practice can encourage either a deep or surface 
approach to learning. The recent adoption of teaching practices that promote flipped classroom designs 
(O'Flaherty, Phillips, Karanicolas, Snelling, & Winning, 2015) and active learning (Freeman et al., 2014) 
is an encouraging direction for this change to happen. 

This study is likely susceptible to self-selection bias, as it included only those students who 
completed the optional self-report instrument. It can therefore be argued that the students who 
completed the instrument were more motivated to complete the course and thus likely more engaged. 
To check for the impact of self-selection bias, we compared the students included in this study and 
others who did not complete the self-report survey on several variables (see Table 3). The comparison 
showed that students included in the study had significantly higher scores on the final exam (Z=-2.883, 
p=0.0038, d=0.1693) and a significantly higher total number of sequences used as indicators of overall 
engagement level (Z=-3.2505, p=0.001, d=0.1909). Midterm exam scores did not differ significantly 
between the two groups. We also identified a significant association between a variable indicating whether a 
student responded to the questionnaire and the cluster the student was assigned to (when clustering 
was done with all 290 students), χ²=13.828, p=0.003. Examining this further using logistic regression, 
we found that the odds of responding to the questionnaire were higher for students pursuing a deep 
learning approach (clusters 1 and 2) than for students following a surface approach (clusters 3 and 4). 
These findings confirm the self-selection bias and warrant future, more inclusive studies. Because 
self-report instruments are optional, this bias is difficult to address with conventional 
self-report approaches. The use of the previously mentioned real-time measures, in addition 
to the benefits related to the validity of the measurement process, could also address the self-selection 
bias and increase the inclusiveness of future studies. 
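
The arithmetic behind the reported comparison statistics can be re-checked with a short sketch. It rests on two assumptions that the text does not state: that the chi-square test has 3 degrees of freedom (4 clusters by 2 response categories), and that the effect sizes reported alongside the Z statistics are numerically consistent with r = |Z| / sqrt(N) for N = 144 + 146 = 290. The function names are hypothetical, and the sketch is not a description of the analysis scripts used in the study.

```python
import math


def effect_size_r(z, n):
    """Effect size r = |Z| / sqrt(N) for a rank-based two-group comparison."""
    return abs(z) / math.sqrt(n)


def chi_square_statistic(table):
    """Return the Pearson chi-square statistic and degrees of freedom
    for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df.
    Closed form; valid only for 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Assumption: N is the full cohort of completers plus non-completers.
N_TOTAL = 144 + 146

print(round(effect_size_r(-2.883, N_TOTAL), 4))   # final exam comparison
print(round(effect_size_r(-3.2505, N_TOTAL), 4))  # learning sequences comparison
print(round(chi2_sf_df3(13.828), 3))              # assumes df = 3
```

Under these assumptions the three printed values agree with the reported effect sizes (0.1693 and 0.1909) and the reported p-value (0.003). The helper chi_square_statistic is included only to show how the test statistic would be obtained from a response-by-cluster table of counts; the per-cluster counts themselves are not given in the text.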


Table 3: Comparisons of students who completed the SPQ and those who did not 

                                     Group who completed    Group who did not complete 
                                     the SPQ (N=144)        the SPQ (N=146) 
Midterm exam                         14 (11, 17)            13 (10.25, 16) 
Final exam                           19 (14, 28)            16 (12, 22) 
Total number of learning sequences   39 (29, 55)            32.5 (21, 45.75) 

5 CONCLUSION 

While the research in learning analytics is rapidly growing, and increasing in depth and diversity, there 
remains much work in addressing the field's primary goals of both understanding and optimizing student 
learning. The study findings further illustrate that student self-report instruments largely measure 
intentions to study rather than realized intentions. In this context, the deficiencies associated with 
interpreting trace data are also reflected in the self-report instruments. That is, while clicks of activity 
sequences provide specific granular detail about a student's realized intentions, there remains a gap in 
connecting how these traces of digital behaviour relate to the learning process. Similarly, the self-report 
instruments provide insight into a student's future intentions for study, which are yet to be 
actualized and evidenced. Clearly, there is further work to undertake in merging these approaches to 
measuring learning. Such multi-faceted approaches have the potential to yield more productive insights 
into student learning and the learning context. 